Using Semi-Supervised Learning and Wikipedia to Train an Event Argument Extraction System

Abstract

The paper presents a methodology for training an event argument extraction system in a semi-supervised setting. We use Wikipedia and Wikidata to automatically obtain a small, noisily labeled dataset and a large unlabeled dataset. The dataset consists of event clusters containing Wikipedia pages in multiple languages. The unlabeled data is iteratively labeled using semi-supervised learning combined with probabilistic soft logic to infer the pseudo-label of each example from the predictions of multiple base learners. The proposed methodology is applied to Wikipedia pages about earthquakes and terrorist attacks in a cross-lingual setting. Our experiments show an improvement of the results when using the proposed methodology. The system achieves an F1-score of 0.79 when only the labeled dataset is used, and 0.84 when trained according to the proposed methodology with probabilistic soft logic.
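The iterative pseudo-labeling loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the probabilistic soft logic inference step is replaced here by a plain average of the base learners' scores, and the base learners are toy per-feature centroid classifiers. All names (`fit_centroids`, `predict_proba`, `self_train`) and the `threshold` and `max_rounds` parameters are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_centroids(labeled, feature):
    """Toy base learner: mean value of one feature for class 0 and class 1."""
    c0 = mean(x[feature] for x, y in labeled if y == 0)
    c1 = mean(x[feature] for x, y in labeled if y == 1)
    return c0, c1


def predict_proba(centroids, value):
    """Soft score for class 1, based on distances to the two centroids."""
    c0, c1 = centroids
    d0, d1 = abs(value - c0), abs(value - c1)
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)


def self_train(labeled, unlabeled, n_features, threshold=0.8, max_rounds=5):
    """Iteratively move confidently pseudo-labeled examples into the labeled set.

    Assumption: the scores of the base learners are combined by simple
    averaging, as a stand-in for probabilistic soft logic inference.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        # Retrain one base learner per feature on the current labeled set.
        learners = [fit_centroids(labeled, f) for f in range(n_features)]
        confident, rest = [], []
        for x in pool:
            score = mean(predict_proba(c, x[f]) for f, c in enumerate(learners))
            if score >= threshold:
                confident.append((x, 1))
            elif score <= 1 - threshold:
                confident.append((x, 0))
            else:
                rest.append(x)
        if not confident:
            break  # no example passed the confidence threshold; stop early
        labeled.extend(confident)
        pool = rest
    return labeled, pool
```

Calling `self_train` with a small seed set that contains both classes and a pool of unlabeled feature tuples returns the enlarged labeled set and the examples that never reached the confidence threshold. A higher `threshold` adds fewer but cleaner pseudo-labels per round.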

Similar resources

Minimally Supervised Event Argument Extraction using Universal Schema

The prediction of events and their participants is an important component of building a knowledge base automatically from text. Typically, the events of interest are domain-specific and not known in advance, and so it is often the case that little or no training data is available to learn the appropriate predictors. In this work, we propose a technique for distantly supervised event argument ex...

Employing Event Inference to Improve Semi-Supervised Chinese Event Extraction

Although a semi-supervised model can extract event mentions that match frequent event patterns, it struggles with event mentions that match infrequent patterns or no pattern at all. To solve this issue, this paper introduces various kinds of linguistic knowledge-driven event inference mechanisms to semi-supervised Chinese event extraction. These event inference mechanisms can ...

Self-Train LogitBoost for Semi-supervised Learning

Semi-supervised classification methods combine unlabeled data with a smaller set of labeled examples in order to increase the classification rate compared with supervised methods, in which training uses only labeled data. In this work, a self-train LogitBoost algorithm is presented. The self-train process improves the results ...

Semi-supervised Learning Using an Unsupervised Atlas

In many machine learning problems, high-dimensional datasets often lie on or near manifolds of locally low rank. This knowledge can be exploited to avoid the “curse of dimensionality” when learning a classifier. Explicit manifold learning formulations such as LLE are rarely used for this purpose, and instead classifiers may make use of methods such as local co-ordinate coding or auto-encoders t...

Relation Extraction Using Label Propagation Based Semi-Supervised Learning

Shortage of manually labeled data is an obstacle to supervised relation extraction methods. In this paper we investigate a graph based semi-supervised learning algorithm, a label propagation (LP) algorithm, for relation extraction. It represents labeled and unlabeled examples and their distances as the nodes and the weights of edges of a graph, and tries to obtain a labeling function to satisfy...

Journal

Journal title: Informatica

Year: 2022

ISSN: 0350-5596, 1854-3871

DOI: https://doi.org/10.31449/inf.v46i1.3577